GH-16583: Access GLM Variance-Covariance Matrix with vcov #16586

Open
manh4wk wants to merge 6 commits into h2oai:master from manh4wk:gh-16583_vcov


manh4wk commented Mar 6, 2025

Made the variance-covariance matrix for GLMs part of the model_output results so that it is accessible from Python and R. The matrix is rearranged in h2o-algos/src/main/java/hex/schemas/GLMModelV3.java so that the Intercept is both the first row and the first column, similar to how the GLM coefficient results are handled in the same area of the code.
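
As a rough illustration of the rearrangement described above (not the actual Java code in GLMModelV3.java), the following Python sketch moves the last row and column of a square matrix to the front. It assumes the Intercept is stored last before the reordering; the function name `move_last_to_front` and all values are hypothetical.

```python
def move_last_to_front(names, matrix):
    """Reorder a square matrix so its last row/column becomes the first.

    Assumption: the Intercept is the last entry before reordering.
    """
    n = len(names)
    # New index order: last element first, then the rest in original order.
    order = [n - 1] + list(range(n - 1))
    new_names = [names[i] for i in order]
    # Apply the same permutation to rows and columns so symmetry is preserved.
    new_matrix = [[matrix[i][j] for j in order] for i in order]
    return new_names, new_matrix


# Hypothetical coefficient names and covariance values.
names = ["age", "year", "Intercept"]
vcov = [
    [0.04, 0.01, -0.02],
    [0.01, 0.09, -0.03],
    [-0.02, -0.03, 0.25],
]
new_names, new_vcov = move_last_to_front(names, vcov)
```

Applying one permutation to both axes keeps the row labels and column labels aligned, which is what makes the Intercept land in position (0, 0).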

This matrix is now accessible with the glm_model_object.vcov() function in Python and with h2o.vcov(glm_model_object) in R.

This change fixes #16583

manh4wk (Author) commented Mar 18, 2025

@tomasfryda Do you know if there's anything else I can do right now to see if this passes all the tests, etc.? I saw you mentioned in another thread the team is pretty busy at the moment.

tomasfryda (Contributor) commented

@manh4wk I don't think so, but I'm no longer active in h2o-3 development, so it would be better to ask @valenad1 or @maurever. Personally, I would like to thank you for taking the time to contribute to open source, but I have no idea when or if it will get merged.

manh4wk (Author) commented Dec 29, 2025

Hi @maurever or @valenad1, can one of you take a look at this? Having the GLM's variance-covariance matrix available will let us do things like run a Wald test on two different levels of a categorical variable to see whether they should be treated as statistically different or combined into a single category.
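
A minimal sketch of the Wald test mentioned above, assuming the two level coefficients and the corresponding variance and covariance entries have already been read from the model. The function name `wald_test_equal` and all numbers are hypothetical. The statistic is compared against a chi-square distribution with one degree of freedom, whose upper tail equals `erfc(sqrt(W / 2))`.

```python
import math


def wald_test_equal(b1, b2, var1, var2, cov12):
    """Wald test of H0: b1 == b2 for two GLM coefficients.

    var1 and var2 are the diagonal vcov entries for the two coefficients,
    and cov12 is the off-diagonal entry between them.
    """
    diff = b1 - b2
    # Var(b1 - b2) = Var(b1) + Var(b2) - 2 * Cov(b1, b2)
    var_diff = var1 + var2 - 2.0 * cov12
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    w = diff * diff / var_diff
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value


# Hypothetical coefficients for two levels of one categorical variable.
w, p = wald_test_equal(b1=0.50, b2=0.20, var1=0.04, var2=0.05, cov12=0.01)
print(f"W = {w:.4f}, p = {p:.4f}")
```

The covariance term is the reason the full matrix is needed: the standard errors alone give only `var1` and `var2`, which is not enough to compute the variance of the difference.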

tomasfryda (Contributor) left a comment

I think it's a good idea to expose the variance-covariance matrix, but that probably depends on @valenad1's decision.
If he agrees, I would suggest fixing the R tests and making sure the column names and row names match. Currently the column names are always lowercased, while the row names aren't. IIRC this can be caused by TwoDimTableV3, so I would consider choosing a different data structure (e.g. H2OFrame).
For example the Intercept vs intercept:
[Screenshot 2026-01-02 at 13 41 54]

Note that I didn't do a complete review; I only looked at the R part of the PR.
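
The row/column naming mismatch described above could be caught with a check along these lines. This is only a sketch: the function name `case_only_mismatches` and the name lists are hypothetical, and it assumes the row and column names have already been extracted from the returned table in the same order.

```python
def case_only_mismatches(row_names, col_names):
    """Return (row, column) name pairs that differ only by letter case."""
    return [
        (r, c)
        for r, c in zip(row_names, col_names)
        if r != c and r.lower() == c.lower()
    ]


# Hypothetical names mirroring the Intercept vs intercept example.
rows = ["Intercept", "age", "year"]
cols = ["intercept", "age", "year"]
mismatches = case_only_mismatches(rows, cols)
print(mismatches)
```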

manualYear <- mFV@model$coefficients_table$year

# compare values from model and obtained manually
for (ind in c(1:length(manuelPValues)))

manuelPValues doesn't seem to be defined anywhere. Also, I would recommend using seq_along(x) instead of 1:length(x): when x is empty, the latter produces c(1, 0), so the loop body still runs twice.

Comment on lines +64 to +66
doTest("GLM: make sure error is generated when a gbm model calls glm functions", testGBMvcov)
doTest("GLM: make sure error is generated when compute_p_values=FALSE", testGLMvcovcomputePValueFALSE)
doTest("GLM: test variance-covariance values", testGLMPValZValStdError)

I would prefer something like:

doSuite("GLM: VCOV support", makeSuite(testGBMvcov, testGLMvcovcomputePValueFALSE, testGLMPValZValStdError))


There is no test that checks whether the implementation actually works. The current tests only check that an error is thrown when vcov is used in an unsupported case.
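
One possible positive check, sketched here in Python with hypothetical values rather than real model output: the standard errors reported in the coefficient table should equal the square roots of the diagonal of the variance-covariance matrix, and the matrix should be symmetric. The helper names are illustrative and are not part of the H2O API.

```python
import math


def std_errors_from_vcov(vcov):
    """Standard errors are the square roots of the vcov diagonal."""
    return [math.sqrt(vcov[i][i]) for i in range(len(vcov))]


def is_symmetric(vcov, tol=1e-12):
    """A covariance matrix must equal its own transpose."""
    n = len(vcov)
    return all(
        abs(vcov[i][j] - vcov[j][i]) <= tol
        for i in range(n)
        for j in range(n)
    )


def vcov_matches_std_errors(vcov, reported, tol=1e-8):
    """Compare derived standard errors with those from the coefficient table."""
    derived = std_errors_from_vcov(vcov)
    return len(derived) == len(reported) and all(
        math.isclose(d, r, rel_tol=tol, abs_tol=tol)
        for d, r in zip(derived, reported)
    )


# Hypothetical matrix and standard errors, Intercept first.
vcov = [
    [0.25, -0.02, -0.03],
    [-0.02, 0.04, 0.01],
    [-0.03, 0.01, 0.09],
]
reported_std_errors = [0.5, 0.2, 0.3]
ok = is_symmetric(vcov) and vcov_matches_std_errors(vcov, reported_std_errors)
```

In a real test, `vcov` and `reported_std_errors` would come from the same fitted model, so the check needs no external reference values.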

Development

Successfully merging this pull request may close these issues.

Expose GLM Variance-Covariance Matrix
